Multiview LSA: Representation Learning via Generalized CCA

Authors

Pushpendre Rastogi

Benjamin Van Durme

Raman Arora

Abstract

Multiview LSA (MVLSA) is a generalization of Latent Semantic Analysis (LSA) that supports the fusion of arbitrary views of data and relies on Generalized Canonical Correlation Analysis (GCCA). We present an algorithm for fast approximate computation of GCCA, which, when coupled with methods for handling missing values, is general enough to approximate some recent algorithms for inducing vector representations of words. Experiments across a comprehensive collection of test sets show our approach to be competitive with the state of the art.
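To make the idea of GCCA concrete, the following is a minimal sketch of one standard formulation (often called MAX-VAR), assuming small, dense, fully observed views. It is not the fast approximate algorithm the abstract refers to and it does not handle missing values; the function name `gcca_maxvar` and the ridge parameter `reg` are assumptions made for illustration. The sketch looks for a shared representation G with orthonormal columns that every view can predict linearly, which amounts to taking the top eigenvectors of the sum of the per-view projection matrices.

```python
import numpy as np

def gcca_maxvar(views, k, reg=1e-8):
    """Exact MAX-VAR GCCA for small data (illustrative, hypothetical helper).

    views: list of arrays, each of shape (n, d_j), with rows aligned across views.
    Returns the shared representation g (n, k) and one linear map per view (d_j, k).
    """
    # Center each view so that projections capture covariance, not means.
    centered = [x - x.mean(axis=0) for x in views]
    n = centered[0].shape[0]

    # Sum of (ridge-regularized) projection matrices onto each view's column space.
    # This builds an n-by-n matrix, so it only suits small n.
    m = np.zeros((n, n))
    for x in centered:
        cov = x.T @ x + reg * np.eye(x.shape[1])
        m += x @ np.linalg.solve(cov, x.T)

    # eigh returns eigenvalues in ascending order; keep the k largest.
    _, vecs = np.linalg.eigh(m)
    g = vecs[:, ::-1][:, :k]

    # Least-squares map from each centered view onto the shared representation.
    maps = [
        np.linalg.solve(x.T @ x + reg * np.eye(x.shape[1]), x.T @ g)
        for x in centered
    ]
    return g, maps
```

Because this version forms an n-by-n matrix explicitly, its cost grows quadratically with the number of rows, which is one reason an approximate computation is attractive at vocabulary scale.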

Similar resources

Multiview Representation Learning via Deep CCA for Silent Speech Recognition

Silent speech recognition (SSR) converts non-audio information such as articulatory (tongue and lip) movements to text. Articulatory movements generally have less information than acoustic features for speech recognition, and therefore, the performance of SSR may be limited. Multiview representation learning, which can learn better representations by analyzing multiple information sources simul...

Deep Generalized Canonical Correlation Analysis

We present Deep Generalized Canonical Correlation Analysis (DGCCA) – a method for learning nonlinear transformations of arbitrarily many views of data, such that the resulting transformations are maximally informative of each other. While methods for nonlinear two-view representation learning (Deep CCA, (Andrew et al., 2013)) and linear many-view representation learning (Generalized CCA (Horst,...

Multiview Fisher Discriminant Analysis

CCA can be seen as a multiview extension of PCA, in which information from two sources is used for learning by finding a subspace in which the two views are most correlated. However, PCA, and by extension CCA, does not use label information. Fisher Discriminant Analysis uses label information to find informative projections, which can be more informative in supervised learning settings. We show ...

Canonical Correlation Analysis for Multiview Semisupervised Feature Extraction

Hotelling’s Canonical Correlation Analysis (CCA) works with two sets of related variables, also called views, and its goal is to find their linear projections with maximal mutual correlation. CCA is most suitable for unsupervised feature extraction when given two views, but it has also long been known that in supervised learning when there is only a single view of data given, the supervision sig...
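As an illustration of the two-view case described in this entry, here is a minimal sketch of classical linear CCA computed by whitening each view and taking an SVD of the whitened cross-covariance. The function name `linear_cca` and the ridge term `reg` are assumptions for illustration and are not taken from the listed paper.

```python
import numpy as np

def linear_cca(x, y, k, reg=1e-8):
    """Classical two-view CCA (illustrative, hypothetical helper).

    x: (n, dx), y: (n, dy), rows aligned. Returns projection matrices
    a (dx, k), b (dy, k) and the top-k canonical correlations.
    """
    x = x - x.mean(axis=0)
    y = y - y.mean(axis=0)
    n = x.shape[0]

    cxx = x.T @ x / n + reg * np.eye(x.shape[1])
    cyy = y.T @ y / n + reg * np.eye(y.shape[1])
    cxy = x.T @ y / n

    def inv_sqrt(c):
        # Inverse matrix square root of a symmetric positive-definite matrix.
        w, v = np.linalg.eigh(c)
        return v @ np.diag(w ** -0.5) @ v.T

    wx, wy = inv_sqrt(cxx), inv_sqrt(cyy)

    # Singular values of the whitened cross-covariance are the canonical correlations.
    u, s, vt = np.linalg.svd(wx @ cxy @ wy)
    a = wx @ u[:, :k]
    b = wy @ vt.T[:, :k]
    return a, b, s[:k]
```

Projecting with `x @ a` and `y @ b` gives paired coordinates whose correlation, column by column, matches the returned canonical correlations up to the small ridge term.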
